Description



DNA, RNA AND A PROTEIN USEFUL FOR DETECTION OF A MYCOBACTERIAL INFECTION 

Technical field 

The invention is in the field of clinical medicine, molecular biology and genetic 
engineering. More particularly, it relates to the molecular methods of 
tuberculosis diagnosis using newly identified DNA sequences which can be used 
as probes for DNA hybridization and/or for DNA amplification leading to the
identification of pathogenic mycobacteria causing disease in humans and 
animals. 

Background

Tuberculosis, an infectious disease mainly caused by respiratory infection with 
Mycobacterium tuberculosis, represents an important subject of multidisciplinary 
investigation owing to the urgent need for rapid and reliable diagnostic tests and 
effective vaccines for disease control. 

An estimated 8 million persons develop tuberculosis each year, and this
number is expected to rise for the foreseeable future. Immunocompromised
people in particular, e.g. Human Immunodeficiency Virus-infected individuals
(Selwyn et al., 1989; Barnes et al., 1991), and the populations of countries with
insufficient public health systems (Grzybowski, 1991; Kochi, 1991) are the most
endangered groups of this "global disease" (WHO, 1992). The emergence of multiple
drug resistant strains poses a major threat to human health not only in
developing countries, but also in developed countries. A rapid and specific
diagnosis of tuberculosis is still a problem.

One approach to address this problem is to use the specific humoral or cellular 
response of the host to infer the presence of disease. Mycobacteria are rich in 
antigens that stimulate the production of antibodies and serology is simple and 
readily applicable as a rapid diagnostic test (Wilkins, 1994). Unfortunately the 
usefulness of serological tests is often limited by their lack of specificity and by
their inability to distinguish between active disease, prior sensitization by
contact with M. tuberculosis or cross-sensitization to other mycobacteria.
Another means of achieving the correct diagnosis is to develop increasingly
sensitive methods to detect the causative bacilli or their products. Such techniques
include amplification of a defined region of bacterial DNA via polymerase
chain reaction (PCR) (Shankar et al., 1991), immunoassays for detecting antigen,
and gas liquid chromatography and mass spectrometry for detecting specific
mycobacterial lipids. Of these, PCR is being evaluated most intensely and
appears to hold the greatest promise.

Attempts have been made to develop methods for the detection of chromosomal 
DNA of the M. tuberculosis complex in patients' sputum (Glennon, 1994). While
the possibility of developing a DNA probe to distinguish between the M.
tuberculosis complex and other mycobacterial strains has been reported, strain 
differentiation within the individual members of the complex is still a problem. 

In this study we report the isolation of novel genomic clones containing as yet
unreported genes and DNA, and the identification of novel M. tuberculosis
chromosomal DNA regions specific for species of the M. tuberculosis complex.
In addition, amplification of (i) a 377-bp fragment specific for the M.
tuberculosis complex and (ii) a 380-bp fragment showing sequence
similarities with the genomes of Mycobacterium asiaticum, Mycobacterium
gastri, Mycobacterium gordonae and Mycobacterium kansasii is described. The
utility of the 377-bp and the 380-bp fragments for the differentiation of species and
strains of mycobacteria is reported. In addition to other ORFs identified in this
study, a novel ca. 15kDa recombinant protein showing high homology to a
family of transposases was overproduced in Escherichia coli as a thioredoxin
fusion and purified. The ca. 15kDa and ca. 31kDa proteins described in this study
are different from the 35kDa ORF belonging to an insertion element identified by
Mariani et al. (1993).

Disclosure of invention

The present invention is based on novel DNA sequences cloned from the 
genome of Mycobacterium tuberculosis, which can be used for strain 
differentiation and for the diagnosis of tuberculosis. 

Accordingly, the DNA sequences of the cloned fragments are an aspect of the
invention.

The cloned DNA fragments are found to code for at least 7 proteins of about
9kDa, 15kDa, 17kDa, 31kDa, 55kDa, 74kDa and 77kDa, the sequences of which
are another aspect of the invention. 

The use of the DNA sequence for detecting specific fragments by hybridization 
or by DNA amplification is another aspect of the invention. 

The use of the cloned DNA or of the proteins coded by the cloned DNA for the 
purpose of serology, skin testing, vaccine development or drug design is another 
aspect of the invention.

The object underlying the invention is achieved by the following
three main embodiments with their preferred embodiments. 

According to a first embodiment the invention concerns a 
DNA 

(a) having sequence (I) according to figure 9, wherein
optionally one or more codons can be replaced by codons coding
for the same amino acid(s),

(b) having a sequence complementary to that of (a),

(c) being single-stranded, wherein its strand is hybridizable
with that of the DNA according to (a) or (b),

(d) being double-stranded, the sequences of its single strands
being defined as in (a) and (b), respectively,

(e) being double-stranded, its single strands being hybridizable
with those of the DNA according to (d), or

(f) being a subsequence of the sequences according to (a) to
(e).

Further the invention concerns a DNA according to (c) or (e),
its single strands being hybridizable with those of the DNA
according to (a), (b), (d) and (f), respectively, at a
temperature of at least 25 °C and at a concentration of NaCl of 
1 M. 
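
The relationship between a strand and its complement, and the optional replacement of codons by codons coding for the same amino acid, can be illustrated by a minimal Python sketch. It is an illustration only and not part of the disclosure: the function names and the short example sequences are hypothetical, and only a small excerpt of the standard genetic code is included, on the assumption that the standard code applies.

```python
# Illustrative sketch only; names and example sequences are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Excerpt of the standard genetic code (assumption: standard code applies).
CODON_TO_AA = {
    "ATG": "Met",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "CCT": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def translate(seq: str) -> list[str]:
    """Translate complete codons using the excerpted table above."""
    usable = len(seq) - len(seq) % 3
    return [CODON_TO_AA[seq[i:i + 3].upper()] for i in range(0, usable, 3)]


def is_synonymous_variant(original: str, variant: str) -> bool:
    """True if both sequences have equal length and encode the same amino acids."""
    return len(original) == len(variant) and translate(original) == translate(variant)


if __name__ == "__main__":
    print(reverse_complement("ATGGGC"))
    print(is_synonymous_variant("ATGGGCCCG", "ATGGGACCA"))
```

A sequence and its codon-replaced variant differ at the DNA level, so whether they still hybridize under given conditions has to be established experimentally; the sketch only checks the encoded amino acids.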

Further the invention concerns a RNA being a transcript of a DNA 
according to the invention. 

Further the invention concerns a protein being encoded by a DNA 
according to the invention. 

Further the invention concerns a protein having the amino acid
sequence (II) according to figure 13. 

The protein according to the invention can be an about 74 kDa 
protein.

Further the invention concerns a protein having the amino acid
sequence (III) according to figure 14. 

The protein according to the invention can be an about 77 kDa 
protein. 

Further the invention concerns a protein having the amino acid 
sequence (IV) according to figure 15. 

The protein according to the invention can be an about 9 kDa 
protein. 

Further the invention concerns a protein having the amino acid 
sequence (V) according to figure 16. 

The protein according to the invention can be an about 55 kDa 
protein.

The protein according to the invention can be a recombinant 
protein, especially a protein produced by means of a bacterial 
strain, a yeast strain, a fungal strain or a cell line of a 
higher eucaryote. 

The protein according to the invention can be encoded by a DNA 
sequence according to the first embodiment of the invention and 
can be recovered by a method comprising the following steps: 

(i) subjecting proteins encoded by said DNA sequence to a 
usual test for diagnosis of tuberculosis,

(ii) selecting a protein showing an inhibitory effect and 

(iii) isolating and recovering said protein. 

Further the invention concerns a DNA, RNA or protein according 
to the invention which can be used for 

(i) diagnosis of tuberculosis in humans and animals and/or

(ii) diagnosis of other mycobacterial infections in humans or
animals, 

each especially by means of samples taken from humans or 
animals. 

Further the invention concerns a use of a DNA according to the 
invention for the identification of mycobacteria in media 
samples.

The foregoing use can comprise the steps of 

(i) isolating the mycobacterium, 

(ii) preparing crude or purified genomic DNA, 

(iii) hybridizing it to a DNA according to the first embodiment 
of the invention and 

(iv) detecting the fragment pattern using conventional methods 
such as a radioactivity assay, chemiluminescence or
fluorescence.

Further the invention concerns a use, wherein as samples 
clinical samples are used. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for 

(i) epidemiological purposes and/or

(ii) vaccination follow-up 

for humans or animals suffering from mycobacterial infections, 
especially tuberculosis. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for the 
development of drugs useful for combating mycobacterial 
infections of humans or animals, especially tuberculosis, 
especially for testing and recovering of substances inhibiting 
mycobacterial infections in humans and animals, especially 
tuberculosis.

According to a second embodiment the invention concerns a DNA

(a) having sequence (VI) according to figure 2, wherein
optionally one or more codons can be replaced by codons coding
for the same amino acid(s),

(b) having a sequence complementary to that of (a),

(c) being single-stranded, wherein its strand is hybridizable
with that of the DNA according to (a) or (b),

(d) being double-stranded, the sequences of its single strands
being defined as in (a) and (b), respectively,

(e) being double-stranded, its single strands being hybridizable
with those of the DNA according to (d), or

(f) being a subsequence of the sequences according to (a) to
(e).

Further the invention concerns a DNA according to (c) or (e),
its single strands being hybridizable with those of the DNA
according to (a), (b), (d) and (f), respectively, at a
temperature of at least 25 °C and at a concentration of NaCl of
1 M.

Further the invention concerns a RNA being a transcript of a DNA 
according to the invention. 

Further the invention concerns a protein being encoded by a DNA 
according to the invention. 

Further the invention concerns a protein having the amino acid 
sequence (VII) according to figure 5. 

The protein according to the invention can be an about 15 kDa 
protein. 

Further the invention concerns a protein having the amino acid 
sequence (VIII) according to figure 6.

The protein according to the invention can be an about 31 kDa
protein. 

The protein according to the invention can be a recombinant 
protein, especially a protein produced by means of a bacterial 
strain, a yeast strain, a fungal strain or a cell line of a 
higher eucaryote. 

The protein according to the invention can be encoded by a DNA
sequence according to the second embodiment of the invention and 
can be recovered by a method comprising the following steps: 

(i) subjecting proteins encoded by said DNA sequence to a 
usual test for diagnosis of tuberculosis, 

(ii) selecting a protein showing an inhibitory effect and 

(iii) isolating and recovering said protein. 

Further the invention concerns a DNA, RNA or protein according 
to the invention which can be used for 

(i) diagnosis of tuberculosis in humans and animals and/or 

(ii) diagnosis of other mycobacterial infections in humans or 
animals, 

each especially by means of samples taken from humans or 
animals.

Further the invention concerns a use of a DNA according to the
invention for the identification of mycobacteria in media
samples.

The foregoing use can comprise the steps of 

(i) isolating the mycobacterium, 

(ii) preparing crude or purified genomic DNA, 

(iii) hybridizing it to a DNA according to the second embodiment 
of the invention and

(iv) detecting the fragment pattern using conventional methods
such as a radioactivity assay, chemiluminescence or
fluorescence.

Further the invention concerns a use, wherein as samples 
clinical samples are used. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for 

(i) epidemiological purposes and/or

(ii) vaccination follow-up 

for humans or animals suffering from mycobacterial infections, 
especially tuberculosis. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for the 
development of drugs useful for combating mycobacterial 
infections of humans or animals, especially tuberculosis, 
especially for testing and recovering of substances inhibiting 
mycobacterial infections in humans and animals, especially 
tuberculosis. 

According to a third embodiment the invention concerns a DNA 

(a) having sequence (IX) according to figure 3, wherein
optionally one or more codons can be replaced by codons coding
for the same amino acid(s),

(b) having a sequence complementary to that of (a),

(c) being single-stranded, wherein its strand is hybridizable
with that of the DNA according to (a) or (b),

(d) being double-stranded, the sequences of its single strands
being defined as in (a) and (b), respectively,

(e) being double-stranded, its single strands being hybridizable
with those of the DNA according to (d), or

(f) being a subsequence of the sequences according to (a) to
(e).

Further the invention concerns a DNA according to (c) or (e),
its single strands being hybridizable with those of the DNA
according to (a), (b), (d) and (f), respectively, at a
temperature of at least 25 °C and at a concentration of NaCl of
1 M.

Further the invention concerns a RNA being a transcript of a DNA 
according to the invention. 

Further the invention concerns a protein being encoded by a DNA 
according to the invention. 

Further the invention concerns a protein having the amino acid 
sequence (X) according to figure 7. 

The protein according to the invention can be an about 17 kDa
protein. 

The protein according to the invention can be a recombinant 
protein, especially a protein produced by means of a bacterial 
strain, a yeast strain, a fungal strain or a cell line of a
higher eucaryote.

The protein according to the invention can be encoded by a DNA 
sequence according to the third embodiment of the invention and 
can be recovered by a method comprising the following steps: 

(i) subjecting proteins encoded by said DNA sequence to a 
usual test for diagnosis of tuberculosis, 

(ii) selecting a protein showing an inhibitory effect and 

(iii) isolating and recovering said protein. 

Further the invention concerns a DNA, RNA or protein according 
to the invention which can be used for 

(i) diagnosis of tuberculosis in humans and animals and/or 

(ii) diagnosis of other mycobacterial infections in humans or 
animals,

each especially by means of samples taken from humans or
animals.

Further the invention concerns a use of a DNA according to the 
invention for the identification of mycobacteria in media 
samples.

The foregoing use can comprise the steps of 

(i) isolating the mycobacterium, 

(ii) preparing crude or purified genomic DNA, 

(iii) hybridizing it to a DNA according to the third embodiment 
of the invention and 

(iv) detecting the fragment pattern using conventional methods 
such as a radioactivity assay, chemiluminescence or
fluorescence.

Further the invention concerns a use, wherein as samples 
clinical samples are used. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for 

(i) epidemiological purposes and/or 

(ii) vaccination follow-up 

for humans or animals suffering from mycobacterial infections, 
especially tuberculosis. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for the 
development of drugs useful for combating mycobacterial 
infections of humans or animals, especially tuberculosis, 
especially for testing and recovering of substances inhibiting 
mycobacterial infections in humans and animals, especially 
tuberculosis.

The invention is explained in detail by the following figures
and experimental data. 

Fig. 1 shows a restriction endonuclease map of the 7.2 kb M. 
tuberculosis chromosomal region; 

Fig. 2 shows a 2253 bp M. tuberculosis chromosomal region
including BamHI, EcoRI and KpnI restriction sites and
oligonucleotides for screening the lambda gt11 M. tuberculosis
library (Primer 1 and Primer 2 underlined) and for amplification
of the 377 bp region (377 bp region in bold, Primer 3 and Primer
4 underlined); amino acid sequences of the about 15 kDa and the
about 31 kDa proteins are shown above the DNA sequences and are
marked with arrows (small arrow about 15 kDa ORF 1, strong arrow
about 31 kDa ORF 2);

Fig. 3 shows a DNA sequence of the 440 bp M. tuberculosis
chromosomal region including the 380 bp region (in bold) used in
PCR experiments and the amino acid sequence of the ORF 3 shown
below the complementary DNA strand (< ORF 3);

Fig. 4 is an overview of the isolated lambda gt11 clone C9-2;
7.2 kb insert fragment, sequenced chromosomal regions and ORF 1, 
ORF 2 and ORF 3 marked with arrows; 

Fig. 5 shows the amino acid sequence of the about 15 kDa protein
(ORF 1);

Fig. 6 shows the amino acid sequence of the about 31 kDa protein
(ORF 2);

Fig. 7 shows the amino acid sequence of the about 17 kDa
protein;

Fig. 8 A shows SDS-PAGE of the insoluble pellet fraction (lane
1) and the purified about 15 kDa recombinant antigen (lane 2);
lane 3 shows protein molecular weight standards (2.850 to
43.000 molecular weight range);

Fig. 8 B shows SDS-PAGE of the purified about 15 kDa thioredoxin
fusion protein (lane 1) and the two protein bands obtained after
enterokinase cleavage (lane 2);

Fig. 9 shows a DNA sequence of M. tuberculosis;

Fig. 10 is a schematic drawing of the clone Mtub-Clara-Klon; the 
open reading frames of about 9 kDa (bp 3536 to bp 3829), 55 kDa 
(bp 2111 to bp 3829), 74 kDa (bp 1538 to bp 3829) and 77 kDa (bp
2698 to bp 2 on the complementary strand) proteins are shown by
arrows and the corresponding coding regions are numbered; 

Fig. 11 A shows a Southern hybridization with genomic DNA from
different mycobacteria digested with PvuII (1: M. tuberculosis
H37Rv; 2: M. avium; 3: M. kansasii; 4: M. necroti; 5: M.
fortuitum; 6: M. phlei; 7: M. smegmatis; 8: M. vaccae);

Fig. 11 B shows a finger-print obtained using the DNA (BamHI
digest) of (1) M. tuberculosis H37Rv, (2) M. tuberculosis H37Ra,
(3) M. bovis BCG, and (4) M. tuberculosis H37Rv digested
with SalI;

Fig. 12 shows a finger-print with DNA from different M. 
tuberculosis clinical isolates (numbered 1 to 12) digested with 
PvuII restriction enzyme; the 4 kb SalI fragment (Mtub-Klar-Klon)
was used as probe;

Fig. 13 shows an amino acid sequence of the protein of about 74
kDa (molecular weight 74999, length 764);

Fig. 14 shows a glycine rich protein of about 77 kDa (molecular
weight 77056, length 899);

Fig. 15 shows the amino acid sequence of the about 9 kDa proline
rich protein (molecular weight 9356, length 98); and

Fig. 16 shows the proline rich protein of about 55 kDa 
(molecular weight 55982, length 573) . 
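
The base-pair coordinates given for the open reading frames in Fig. 10 can be cross-checked against the protein lengths stated for Figs. 13 to 16. The following Python sketch does this under the assumption that both coordinates are inclusive and that the stated length counts one unit per codon; the source does not say whether a stop codon is included, and the function and variable names are hypothetical.

```python
# Illustrative consistency check; coordinates and lengths are taken from the
# figure legends, everything else (names, inclusive-coordinate convention)
# is an assumption made for this sketch.

def orf_length_bp(start: int, end: int) -> int:
    """Length of an ORF with inclusive coordinates, on either strand."""
    return abs(end - start) + 1


def orf_codons(start: int, end: int) -> int:
    """Number of complete codons; raises if the span is not a multiple of 3."""
    length = orf_length_bp(start, end)
    if length % 3 != 0:
        raise ValueError(f"span of {length} bp is not a whole number of codons")
    return length // 3


# name: (start bp, end bp, stated length)
ORFS = {
    "about 9 kDa": (3536, 3829, 98),
    "about 55 kDa": (2111, 3829, 573),
    "about 74 kDa": (1538, 3829, 764),
    "about 77 kDa (complementary strand)": (2698, 2, 899),
}

if __name__ == "__main__":
    for name, (start, end, stated) in ORFS.items():
        codons = orf_codons(start, end)
        status = "matches" if codons == stated else "differs from"
        print(f"{name}: {orf_length_bp(start, end)} bp, {codons} codons, {status} stated length {stated}")
```

Under these assumptions each span is a whole number of codons and the codon count equals the stated length in all four cases.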

Modes for Carrying out the invention

We were interested in identifying and cloning novel DNA sequences from the 
genome of Mycobacterium tuberculosis for use in rapid and specific diagnosis 
of tuberculosis. Our strategy was to target new repeated elements and insertion
elements which are present only in M. tuberculosis or in the strains of the M.
tuberculosis complex.



Examples 

The following examples further describe the isolation and sequencing of M.
tuberculosis DNA containing a putative IS-element (Insertion Element) and repeat
sequences, e.g., PGRS-elements (Polymorphic GC-Rich-Sequences) and the use 
of the as yet unreported DNA sequences for strain identification and diagnosis 
of tuberculosis. 



Escherichia coli strains, phages and plasmids: The Escherichia coli K12 strain 
Y1090r- (Huynh et al., 1985) was used to propagate the λgt11 library and the E.
coli K12 strain GI724 (Invitrogen, Leek, The Netherlands) was the host for the
production of the ca. 15kDa protein fused to thioredoxin.

The recombinant DNA library of M. tuberculosis genomic DNA in the λgt11
expression vector was constructed by Young et al. (1985).

The plasmid vector pTrxFus (Invitrogen, Leek, The Netherlands) was used to 
make an in-frame fusion with thioredoxin as an amino-terminal fusion partner. 

Mycobacterial strains and preparation of cell extracts: The mycobacterial strains 
used in this study are shown in Table 1 (Results and Discussion). All organisms
were grown on Loewenstein medium. For preparing cell extracts a loop of bacteria was
suspended in 0.5 ml of 10 mM Tris/base, 1 mM EDTA (pH 7.4) followed by addition of 0.5 ml
glass beads (150-212 microns, Sigma, Deisenhofen, Germany). The suspension was incubated
at 80°C for 10 min followed by a 1 min treatment in a Mini-Bead Beater (Biospec Products).

DNA sequence analysis: Similarity comparisons were done using the BLAST program 
(Pearson and Lipman, 1988; NCBI computing facility). 

All DNA manipulations were done according to standard procedures (see Maniatis et al.,
1982).

DNA sequencing: DNA sequencing analysis was performed by the dideoxynucleotide
chain termination method using a PCR sequencing kit (ABI PRISM™ Dye Terminator Cycle
Sequencing Ready Reaction Kit, Perkin Elmer, Warrington, Great Britain) on a 373A DNA 
Sequencer (Applied Biosystems, Warrington, Great Britain). DNA sequences were determined 
for both strands by primer walking. 

1. Clone containing putative IS-Element 
1.1 Isolation of the clone C9-2 containing a putative IS element:
In our attempt to isolate new mycobacterial insertion elements, a λgt11 M. tuberculosis library
was screened with oligodeoxyribonucleotide primers based on conserved regions of different
insertion elements. The library was screened as described by Young and Davis (1985). Briefly,
phage-infected cells of the strain E. coli Y1090r- were plated in top agar on Luria-Bertani plates
(7.0 x 10^6 PFU per 85 mm plate) and incubated for 6-8 h at 42°C. Nylon membranes (Biodyne
B Transfer Membrane, 0.45 µm, Pall, Portsmouth, England) were overlaid on plates. The filters
were treated with 0.5 N NaOH, 1.5 M NaCl and the DNA was fixed via UV-crosslinking.
Screening was performed using 3'-end labeled oligonucleotides of the sequence 5'-
TGACGCGAGTGGGTGTGATTTCG-3' and 5'-GTGGTCGAGCCGTTGATGCCG-3' (Fig. 2,
PRIMER 1 and PRIMER 2). Digoxigenin-labeling of the oligodeoxyribonucleotide primers
was carried out using a DIG Oligonucleotide 3'-End Labeling Kit (Boehringer Mannheim,
Germany). Hybridization was done at 45°C in hybridization buffer (Boehringer Mannheim,
Germany) overnight. Then the membranes were washed under stringent conditions for 5 min
twice in 2 x SSC, 0.1% SDS and for 15 min twice at 37°C in 0.1 x SSC, 0.1% SDS.
Chemiluminescent detection was carried out with the help of a DIG Luminescent Detection Kit
(Boehringer Mannheim, Germany). Plaques were purified by three rounds of plating to obtain
single plaques. Phage DNA was isolated using a Nucleobond AX L50 Kit (Macherey-Nagel,
Düren, Germany) and restriction mapping of the selected clone was performed by standard
procedures (Maniatis et al., 1982). 
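
The two wash steps differ mainly in salt concentration, which is what makes the second wash the more stringent one. As a rough illustration, the sketch below converts the SSC dilutions named in the washing protocol into approximate sodium ion concentrations. It assumes the conventional composition of 1 x SSC (0.15 M NaCl, 0.015 M trisodium citrate), which the text itself does not state; the function name is hypothetical.

```python
# Illustrative sketch; assumes the conventional 1x SSC composition
# (0.15 M NaCl, 0.015 M trisodium citrate), which the text does not specify.
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_M = 0.015


def ssc_sodium_molar(fold: float) -> float:
    """Approximate total Na+ molarity of an n-fold concentrated SSC buffer."""
    nacl = SSC_1X_NACL_M * fold
    citrate = SSC_1X_CITRATE_M * fold
    # Each trisodium citrate contributes three sodium ions.
    return nacl + 3 * citrate


if __name__ == "__main__":
    for label, fold in (("first washes", 2.0), ("second washes", 0.1)):
        print(f"{label}: {fold} x SSC, about {ssc_sodium_molar(fold):.4f} M Na+")
```

Under this assumption the second wash contains about one twentieth of the sodium of the first, so imperfectly matched hybrids are more likely to be removed there.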

Several positive clones were obtained. Detailed analysis of one of the clones (C9-2) is 
presented here. The recombinant phage was mapped with the restriction endonucleases BamHI, 
EcoRI and KpnI (Fig. 1). EcoRI digestion revealed a 7.2 kb DNA insert fragment.

1.2 DNA sequencing of the cloned fragment:

Two M. tuberculosis chromosomal regions of 2253-bp and 440-bp of this fragment were 
sequenced (Fig. 2 and Fig. 3). DNA sequencing of the 2253-bp region revealed the presence of a
putative insertion element between bp 401 and bp 1378 containing inverted repeats flanked by
duplications of 4 base pairs. The cloned fragment reported here is novel and is located at a
different position than the 2.1 kb PstI/EcoRI fragment reported by Mariani et al. (1993),
because the DNA sequences of the adjoining regions on the left and the right ends of the putative
IS-element were completely different in our clone C9-2 as compared to those reported by Mariani
et al. (1993).

Fig. 4 gives an overview of the 7.2 kb insert fragment and the sequenced chromosomal regions.

1.3 Novel Proteins coded by the cloned DNA: 

During the molecular characterization of the clone, novel ORFs were identified. The complete 
ORF of the ca. 15kDa protein is located on the 2253-bp fragment coded by a 408-bp fragment, 
corresponding to a coding capacity of 136 amino acids. The ca. 15kDa protein (Fig. 5) is a novel
product showing limited homology in the N-terminus of a 34kDa ORF reported by Mariani et
al. (1993). We also identified an ORF of about 31kDa (Fig. 2 and Fig. 6) coded by the cloned
DNA (bp 515 to bp 1378). This 31kDa ORF did not show any homology in the N-terminus to
any known sequence in the database. The C-terminus of the ca. 31kDa protein showed
homology to a 34kDa ORF (Mariani et al., 1993). We have not used the DNA sequence
showing homology to the sequence reported by Mariani et al. (1993) as far as the claims of this
patent application are concerned. An ORF (ORF 3, Fig. 3 and Fig. 7) on the complementary
strand to the 3'-end of the insert fragment of the recombinant λ clone C9-2 was identified,
which had not been reported earlier. This sequence showed homology to a family of
transcription regulators in microorganisms. In addition, some homology was observed with a
putative two-component system mtrA-mtrB isolated from M. tuberculosis H37Rv (Via et al.,
1996) and to PhoP of Bacillus subtilis (Lee and Hulett, 1992). Based on these data, the DNA
sequence (440-bp fragment, Fig. 3) and the derived polypeptide might play a role in regulation 
of virulence in mycobacteria. 
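
The stated sizes of ORF 1 and ORF 2 can be related to their coding lengths with a simple estimate. The sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue, which is an assumption of this illustration and not a value from the text; actual masses depend on composition, and the function names are hypothetical.

```python
# Illustrative estimate only. The 110 Da average residue mass is a rule of
# thumb assumed for this sketch, not a figure taken from the text.
AVERAGE_RESIDUE_MASS_DA = 110


def coding_capacity(length_bp: int) -> int:
    """Number of complete codons in a coding region of the given length."""
    return length_bp // 3


def estimated_mass_kda(residues: int) -> float:
    """Rough protein mass in kDa from the number of residues."""
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000


if __name__ == "__main__":
    orf1_residues = coding_capacity(408)              # 408-bp coding fragment
    orf2_residues = coding_capacity(1378 - 515 + 1)   # bp 515 to bp 1378, inclusive
    print(orf1_residues, round(estimated_mass_kda(orf1_residues), 1))
    print(orf2_residues, round(estimated_mass_kda(orf2_residues), 1))
```

With this rule of thumb the 408-bp fragment gives 136 codons and an estimate close to 15 kDa, and the span from bp 515 to bp 1378 gives 288 codons and an estimate of roughly 31 to 32 kDa, in line with the sizes named for the two proteins.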

1.4 Cloning, expression and purification of the ca. 15kDa protein fused to 
thioredoxin 

The λgt11 clone C9-2 (Fig. 4) was used as template to amplify a PCR fragment of 951-bp
(Fig. 2, sequence position 451-1378) including the ORF for the ca. 15kDa protein (Fig. 5)
and cleavage sites for the restriction endonucleases SmaI and SalI at the 5'- and 3'-ends.
Amplification of the SmaI-SalI mycobacterial DNA fragment for insertion into pTrxFus
(Invitrogen, Leek, The Netherlands) was done using the oligonucleotide primers with the
sequence 5'-TCTAGACATATGACGCGAGTGGGTGTGATTTCG-3' (PRIMER 7, forward)
and 5'-CATATGGTCGACCTAGGGCGTGTCTCCCAA-3' (PRIMER 8, reverse) 
corresponding to sequence positions 451-474 and 1378-1361 (Fig. 2). Composition of the 
reaction mix was the same as described above with 400 ng phage DNA as template. The probe 
was amplified in 30 cycles consisting of the same conditions as described. Cleavage sites were 
introduced by appropriate primers. After digestion with both restriction endonucleases the 
product was inserted in pTrxFus (Invitrogen, Leek, The Netherlands) to form the plasmid 
pCH3-8. 

The E. coli strain GI724 was electroporated with the plasmid pCH3-8. Bacterial cultures (200 
ml of Induction Medium (Invitrogen, Leek, The Netherlands) supplemented with 100 µg/ml
ampicillin) grown at 30°C were induced to synthesize the fusion protein by tryptophan addition
(100 µg/ml) and temperature shift to 37°C. Cells were collected after 4 hours (10 000 x g, 5 min,
4°C), resuspended in 4 ml Osmotic Shock Solution (Invitrogen, Leek, The Netherlands), broken
by three rounds of alternate sonication on ice (10 sec.) and shock freezing in liquid nitrogen,
and pelleted (10 000 x g, 15 min, 4°C). Most of the fusion protein accumulated in the form of
inclusion bodies and only a small fraction was present as soluble protein inside the cells. The
pellet containing the inclusion bodies was resuspended (denaturation) in 10 ml 6 M 
guanidine/HCl (pH 8.5), incubated for 2 hours at room temperature and pelleted again (10 000 x 
g, 30 min, 4°C). The recombinant fusion protein was refolded by dialysing against 50 mM 
Tris/HCl (pH 8.0). Anion exchange chromatography was done with the help of a BioCAD 
perfusion system (Perseptive Biosystems) on a Poros column HQ/M (Perseptive Biosystems).
Elution was performed using a linear NaCl gradient (0-1 M). The fusion protein concentration
was determined with the BioRad Protein Assay Kit (BioRad, Munich, Germany). Purity was
assessed by densitometry (Molecular Dynamics, Software Image Quant) and analytical
SDS-PAGE and Coomassie staining.

The ca. 15kDa protein fused to thioredoxin was refolded as described above. Further
purification of the ca. 15kDa protein fused to thioredoxin was carried out by anion exchange
chromatography (Fig. 8, A lane 3 and B lane 1). After enterokinase cleavage of the purified ca.
15kDa protein fused to thioredoxin two protein bands were detectable on SDS-PAGE (Fig. 8,
lane 2). By western blotting with a thioredoxin monoclonal antibody the lower 11kDa band was
identified to be thioredoxin. The upper band corresponds to the ca. 15kDa recombinant protein
of M. tuberculosis. This is the first report of expression and purification of the ca. 15kDa 
protein of M. tuberculosis in E. coli. 

1.4. Species specific diagnosis of mycobacteria:
Deprotected and desalted oligonucleotide primers were obtained from Gibco BRL (Eggenstein,
Germany) or Eurogentec (Seraing, Belgium). 

The oligodeoxyribonucleotide primers with the sequence 5'-GTCCATGTGCCGCCGCTG-3'
(PRIMER 3, forward) and 5'-CTGCGCGGCTCCCGGCA-3' (PRIMER 4, reverse),
specific for the DNA regions of the 2253-bp M. tuberculosis chromosomal region shown in Fig. 
2 were used in PCR experiments to amplify a 377-bp fragment. 

For amplification of a 380-bp fragment from the 440-bp chromosomal fragment, the 
oligodeoxyribonucleotide primers with the sequences 5'-CGAGGCTGAACGGCTTTG-3'
(PRIMER 5, forward) and 5'-TCAACGTCCGCGGCAAGC-3' (PRIMER 6, reverse) 
corresponding to the DNA region shown in Fig. 3 were used. Amplifications were performed in 
0.2 ml MicroAmp Reaction Tubes (Perkin Elmer, Norwalk, Connecticut, USA) in a final
volume of 100 µl using a GeneAmp® PCR Kit (Perkin Elmer, Branchburg, New Jersey, USA).
Reaction mixtures contained 10 mM Tris/HCl (pH 8.3), 50 mM KCl, 3 mM MgCl2, 200 µM
dNTP, 0.1 µM primer, 30-100 ng chromosomal DNA from mycobacterial cell extracts (Table
1) and 2.5 U AmpliTaq® DNA polymerase. All components of a PCR reaction except for the
template are included in the kit. The reactions were performed using the automated Thermal
Cycler GeneAmp PCR System 9600 (Perkin Elmer, Norwalk, Connecticut, USA). The samples
were amplified by 40 cycles consisting of denaturation at 96°C for 2 min, annealing of the
primers at 25°C for 1 min and primer extension at 72°C for 3 min.
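
For orientation, basic properties of the four PCR primers and the absolute amounts implied by the stated concentrations can be computed as sketched below. The primer sequences are copied from the text with internal spaces removed. The melting temperature uses the simple Wallace rule, Tm = 2 x (A+T) + 4 x (G+C) in °C, which is only a rough estimate for short oligonucleotides and is an assumption of this sketch rather than the method used in the study; the function names are hypothetical.

```python
# Illustrative sketch; the Wallace rule is a rough approximation chosen for
# this example, and all function names are hypothetical.
PRIMERS = {
    "PRIMER 3": "GTCCATGTGCCGCCGCTG",
    "PRIMER 4": "CTGCGCGGCTCCCGGCA",
    "PRIMER 5": "CGAGGCTGAACGGCTTTG",
    "PRIMER 6": "TCAACGTCCGCGGCAAGC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def amount_pmol(concentration_um: float, volume_ul: float) -> float:
    """Amount in pmol, since 1 µM x 1 µl equals 1 pmol."""
    return concentration_um * volume_ul


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm about {wallace_tm(seq)} °C")
    print("primer per 100 µl reaction (pmol):", amount_pmol(0.1, 100))
    print("dNTP per 100 µl reaction (pmol):", amount_pmol(200, 100))
```

The estimates can be compared with the annealing temperature given in the cycling conditions; more accurate values would require a nearest-neighbour calculation that takes the salt and primer concentrations into account.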

After amplification, 10 µl of each product was electrophoresed in a horizontal 1.5% agarose 
gel. Gels were precast using a 1:10 000 dilution of SYBR Green I stock reagent (Eugene, 
Leiden, The Netherlands) in 10 mM Tris/HCl, 1 mM EDTA (pH 8.0). 

For DNA sequencing the appropriate 377-bp and 380-bp PCR products from the mycobacterial 
cell extract samples (Table I) were purified from a 1.5% agarose gel using a Gel Extraction 
Kit (QIAGEN, Hilden, Germany). 

1.4.1. The 377-bp region: 

The 377-bp region (Fig. 2) of the isolated and sequenced 2253-bp M. tuberculosis chromosomal 
fragment and the 380-bp region (Fig. 3) of the identified 440-bp chromosomal fragment were 
examined for their suitability for strain differentiation (Table 1). A PCR product of the 
predicted size with 100% DNA sequence homology in the 377-bp region was detected only in 
the members of the M. tuberculosis complex. No amplification product was obtained from other 
mycobacteria (Table 1). Therefore, the PCR primers of the 377-bp region are useful for the 
rapid discrimination of the M. tuberculosis complex (M. tuberculosis, Mycobacterium bovis, 
Mycobacterium bovis BCG, Mycobacterium africanum and Mycobacterium microti) from 
other mycobacteria. 

1.4.2. The 380-bp region: 

A predominant amplification product of the correct size for the 380-bp region was obtained from the 
chromosomal DNA samples of the M. tuberculosis complex, including the vaccine strain M. 
bovis BCG, the tuberculosis isolate Tubl 18 and the mycobacterial species M. asiaticum, 
M. gastri, M. gordonae and M. kansasii. Thus, this fragment can be used for the identification of 
the above mycobacterial species, since no amplification product was obtained from other 
mycobacterial species (Table 1). 
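Taken together, the results reported for the two primer pairs suggest a simple reading of the amplification pattern, sketched below. The function name and the wording of the returned labels are hypothetical; the combination of a 377-bp product without a 380-bp product is not described in the text and is therefore only flagged.

```python
def interpret_amplicons(band_377: bool, band_380: bool) -> str:
    """Interpret presence/absence of the 377-bp and 380-bp PCR products."""
    if band_377 and band_380:
        # Both products were reported only for the M. tuberculosis complex.
        return "M. tuberculosis complex"
    if band_380:
        # 380-bp product alone: the four additional species named in the text.
        return "M. asiaticum, M. gastri, M. gordonae or M. kansasii"
    if band_377:
        # Combination not described in the text.
        return "unexpected pattern (377-bp product without 380-bp product)"
    return "other mycobacteria, or no amplifiable mycobacterial DNA"


if __name__ == "__main__":
    for pattern in ((True, True), (False, True), (True, False), (False, False)):
        print(pattern, "->", interpret_amplicons(*pattern))
```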

2. Clone containing PGRS Element 
2.1. Cloning of DNA fragment containing PGRS elements: 

We screened a Lawrist cosmid library of M. tuberculosis DNA using a degenerate 
oligonucleotide of the sequence 
5'-C/GGCC/GGCC/GGGC/GACC/GGGC/GGGC/GGCCGGCTCC/GGG-3', which was designed 
in such a way that it contained GC-rich regions and also coded for a putative proline-rich 
polypeptide. Colony hybridization using the labelled oligonucleotide was performed using standard 
procedures (Maniatis et al., 1982). Filters were prehybridized and probed at 42°C overnight in a 
solution containing 6xSSC, 1 mM sodium phosphate, 1 mM EDTA, 0.05% skimmed milk and 
0.5% SDS. Filters were washed twice in 2xSSC, 0.3% SDS for 15 min at 65°C. 
The first screening yielded six positive clones, which were rechecked by hybridization with the 
oligonucleotide. Three clones gave a strong signal, and restriction mapping of these clones showed 
identical restriction patterns. Further restriction mapping and Southern hybridization of one of 
the clones identified a SalI fragment of about 4 kb that hybridized strongly to the 
oligonucleotide. 
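The degenerate probe can be expanded programmatically to see how many individual sequences it represents. The sketch below assumes that each "C/G" in the printed sequence denotes a single position that is either C or G (IUPAC code S); the helper names are hypothetical.

```python
from itertools import product

# Probe as printed in the text (5'->3'); "C/G" is read as one ambiguous base.
PROBE_AS_PRINTED = "C/GGCC/GGCC/GGGC/GACC/GGGC/GGGC/GGCCGGCTCC/GGG"


def to_iupac(printed: str) -> str:
    """Replace every 'C/G' by the IUPAC ambiguity code S (C or G)."""
    return printed.replace("C/G", "S")


def degeneracy(iupac: str) -> int:
    """Number of distinct sequences represented (two choices per S)."""
    return 2 ** iupac.count("S")


def expand(iupac: str):
    """Yield every concrete sequence encoded by an S-containing probe."""
    choices = [("C", "G") if base == "S" else (base,) for base in iupac]
    for combination in product(*choices):
        yield "".join(combination)


if __name__ == "__main__":
    probe = to_iupac(PROBE_AS_PRINTED)
    print(probe, len(probe), degeneracy(probe))
```

Under this reading the probe is a 30-mer with eight ambiguous positions, that is, a mixture of 256 sequences.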

2.2. DNA sequencing of the cloned fragment: The ca. 4kb SalI fragment was 
subcloned in pUC19 and the clone was named Mtub-Clara-Klon. The entire insert was sequenced 
by the primer walking method. The DNA sequence is presented in figure 9. There were unusual 
difficulties in obtaining the sequence of the recombinant clone because of the high GC 
content and the presence of unusual repeats. 
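The GC content and the repetitiveness mentioned here can be quantified with a short script. The following is a minimal sketch with hypothetical helper names; the 50-nt example is one line of the sequence in figure 9 as printed there, and the window and k-mer sizes are arbitrary choices for illustration.

```python
from collections import Counter

# One 50-nt line of the figure 9 sequence, as printed (illustration only).
EXCERPT = ("AGCCCGCCGG" "TGCCGCCGGC" "ACCGGGCGTG" "GCCCCGGCTT" "TGCCGGCGTT")


def windowed_gc(seq: str, window: int, step: int):
    """Return (start, GC fraction) for successive windows along the sequence."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        result.append((start, (chunk.count("G") + chunk.count("C")) / window))
    return result


def repeated_kmers(seq: str, k: int, min_count: int = 2) -> Counter:
    """Count overlapping k-mers and keep those seen at least min_count times."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter({kmer: n for kmer, n in counts.items() if n >= min_count})


if __name__ == "__main__":
    print(windowed_gc(EXCERPT, window=25, step=25))
    print(repeated_kmers(EXCERPT, k=6).most_common(5))
```

Applied to the full insert, the same two functions would give a GC profile along the fragment and a list of the most frequent short repeats.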

2.3. Proteins coded by the cloned DNA: 

We identified at least 4 ORFs (open reading frames) coding for proteins of ca. 9kDa, 55kDa, 74kDa and 
77kDa (Fig. 10). Interestingly, the amino acid sequences of the 9kDa, 55kDa, 74kDa 
and 77kDa proteins did not show strong homology to any sequences reported so far for 
mycobacteria (Genbank and Swissprot databases). In addition, the 9kDa, 55kDa and 
74kDa proteins have an unusually high proline content; nevertheless, no strong homology with 
the known proline-rich antigens (Laqueyrerie et al. 1995; Infect. Immun. 63, 4003) of 
mycobacteria was observed. Unexpectedly, the amino acid sequences showed restricted 
homology to mucin-like proteins from eucaryotes. The 77kDa protein is highly rich in the amino 
acid glycine and may be a cell wall protein of Mycobacterium tuberculosis. Such proteins have 
not been reported from M. tuberculosis. 
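How a GC-rich stretch of the insert translates into a proline-rich peptide can be illustrated with the standard genetic code. This is a sketch only: the helper names are hypothetical, the 33-nt excerpt is taken from the figure 9 sequence around position 1850, and the reading frame is chosen for illustration rather than taken from an ORF annotation in the text.

```python
# Standard genetic code in the usual TCAG ordering of the three codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Excerpt of the figure 9 sequence as printed (frame chosen for illustration).
EXCERPT = "CCGCCGGTGGTGATCCCGGACCCTCCCGAGCCG"


def translate(dna: str) -> str:
    """Translate complete codons; unreadable codons are reported as 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def proline_fraction(protein: str) -> float:
    """Fraction of proline residues in a one-letter amino acid sequence."""
    return protein.count("P") / len(protein)


if __name__ == "__main__":
    peptide = translate(EXCERPT)
    print(peptide, round(proline_fraction(peptide), 2))
```

In this frame the excerpt reads PPVVIPDPPEP, in which six of eleven residues are proline.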

2.4. DNA fingerprinting: 
The ca. 4kb SalI fragment was used to probe (Southern hybridization) genomic DNA of 
different mycobacteria digested with PvuII (Fig. 11). Each strain showed a 
characteristic pattern, allowing the differentiation of M. tuberculosis-Rv, M. tuberculosis-Ra, M. 
bovis and the M. tuberculosis Erdman strain. The ca. 4kb SalI fragment is also suitable for 
fingerprinting of clinical isolates, since hybridization of the probe to the genomic DNA of 
clinical isolates from tuberculosis patients also yielded strain-specific fingerprints (Fig. 12). No 
hybridization to the genomic DNA of M. smegmatis, M. vaccae, M. avium, M. chelonae, M. 
fortuitum or M. phlei was observed. 
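The fragment sizes seen in such a fingerprint are determined by the positions of the restriction sites in the digested DNA. The sketch below shows the principle on a linear sequence; the helper names are hypothetical, and the recognition sequences and cut offsets are the well-known ones for SalI (G^TCGAC), PvuII (CAG^CTG) and BamHI (G^GATCC), not values taken from this description. It does not reproduce the genomic patterns of the figures, which depend on the whole genome.

```python
# Recognition site and cut offset (bases from the start of the site).
ENZYMES = {
    "SalI": ("GTCGAC", 1),
    "PvuII": ("CAGCTG", 3),
    "BamHI": ("GGATCC", 1),
}

# First 50 nt of the figure 9 sequence as printed; it starts with a SalI site.
FIRST_50 = ("GTCGACGTCT" "ACCGCACCTT" "CGTCGGCGAG" "ATGGACGACG" "AAGAGGCCGA")


def find_sites(seq: str, site: str):
    """Return 0-based start positions of every occurrence of the site."""
    seq, site = seq.upper(), site.upper()
    positions, start = [], seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest(seq: str, enzyme: str):
    """Return fragment lengths after cutting a linear sequence with one enzyme."""
    site, offset = ENZYMES[enzyme]
    cuts = [pos + offset for pos in find_sites(seq, site)]
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


if __name__ == "__main__":
    print(find_sites(FIRST_50, ENZYMES["SalI"][0]))
    print(digest("AAGTCGACAA", "SalI"))
```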

Claims 



1. (I) DNA 

(a) having sequence (I) according to figure 9, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e); or 

(II) DNA 

(a) having sequence (VI) according to figure 2, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e); or 

(III) DNA 

(a) having sequence (IX) according to figure 3, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e). 

2. A DNA according to claim 1 (I)(c), (I)(e), (II)(c), (II)(e), 
(III)(c) or (III)(e), its single strands being hybridizable at a 
temperature of at least 25°C and at a concentration of NaCl of 
1 M. 

3. RNA being a transcript of a DNA according to claim 1 or 2. 

4. Protein being encoded by a DNA according to claim 1 or 2. 

5. Protein having the amino acid sequence (II) according to 
figure 13 . 

6. An about 74 kDa protein according to claim 4 or 5. 

7. Protein having the amino acid sequence (III) according to 
figure 14 . 

8. An about 77 kDa protein according to claim 4 or 7. 

9. Protein having the amino acid sequence (IV) according to 
figure 15 . 

10. An about 9 kDa protein according to claim 4 or 9. 

11. Protein having the amino acid sequence (V) according to 
figure 16. 

12. An about 55 kDa protein according to claim 4 or 11. 

13. Protein having the amino acid sequence (VII) according to 
figure 5 . 

14. An about 15 kDa protein according to claim 4 or 13. 

15. Protein having the amino acid sequence (VIII) according to 
figure 6 . 

16. An about 31 kDa protein according to claim 4 or 15. 

17. Protein having the amino acid sequence (X) according to 
figure 7 . 

18. An about 17 kDa protein according to claim 4 or 17. 

19. A protein according to any of claims 4 to 18, wherein the 
protein is a recombinant protein, especially a protein produced 
by means of a bacterial strain, a yeast strain, a fungal strain 
or a cell line of a higher eucaryote. 

20. A protein being encoded by a DNA sequence according to claim 
1 or 2 and which can be recovered by a method comprising the 
following steps: 

(i) subjecting proteins encoded by said DNA sequence to a 
usual test for diagnosis of tuberculosis, 

(ii) selecting a protein showing an inhibitory effect and 

(iii) isolating and recovering said protein. 



21. DNA according to claim 1 or 2, RNA according to claim 3 or 
protein according to any of claims 4 to 20 which can be used for 

(i) diagnosis of tuberculosis in humans and animals and/or 

(ii) diagnosis of other mycobacterial infections in humans or 
animals, 

each especially by means of samples taken from humans or 
animals. 

22. Use of a DNA according to any of claims 1, 2 or 21 for the 
identification of mycobacteria in media samples. 

23. Use according to claim 22, comprising the steps of 

(i) isolating the mycobacterium, 

(ii) preparing crude or purified genomic DNA, 

(iii) hybridizing it to a DNA according to claim 1 or 2 and 

(iv) detecting the fragment pattern using conventional methods 
such as a radioactivity assay, chemiluminescence or 
fluorescence. 

24. Use according to any of claims 22 to 23, wherein as samples 
clinical samples are used. 

25. Use of a DNA according to any of claims 1 to 2 or of a 
protein according to any of claims 4 to 20 for 

(i) epidemiological purposes and/or 

(ii) vaccination follow-up 

for humans or animals suffering from mycobacterial infections, 
especially tuberculosis . 

26. Use of a DNA according to any of claims 1 to 2 or of a 
protein according to any of claims 4 to 20 for the development 
of drugs useful for combating mycobacterial infections of humans 
or animals, especially tuberculosis, especially for testing and 
recovering of substances inhibiting mycobacterial infections in 
humans and animals, especially tuberculosis. 
Fig. 9 

10 20 30 40 50 
GTCGACGTCT ACCGCACCTT CGTCGGCGAG ATGGACGACG AAGAGGCCGA 

60 70 80 90 100 
CCATCATTAC CGCGCGGGCA TGGCGATGGG CACCACGTTC CAGGTGCCGC 

110 120 130 140 150 
CGCAGATGTG GCCACCGGAT CGGGCGGCCT TCGACCGCTA CTGGCGGCAA 

160 170 180 190 200 
TCACTGGACA GGGTGCACAT CGATGACGTC GTTCGCGACT ACCTGTATCC 

210 220 230 240 250 
GATCGTGGCG CTCCGAATTC GCGGGATCGC ACTGCCGGGT CCGCTGCGGC 

260 270 280 290 300 
GGCTCTCGGA GGGTATCGCG CTGCTGATCA CCACCGGTTT CCTGCCGCAG 

310 320 330 340 350 
CGGTTTCGCG ACGAGATGCG GTTGCCGTGG GACGCGACCA AGCAGCGGCG 

360 370 380 390 400 
CTTTGACGCG CTCATGGCCG TGCTGCGCAC GGTGAATCGC CTGATGCCGC 

410 420 430 440 450 
GGTTTGTCCG GCAGTTCCCG TTCAACCTGA TGCTCTGGGA CCTGGACCGG 

460 470 480 490 500 
CGGATGAGGC GCGGGCGCCC GCTGGTGTAA TCGACGGCTT CGCGTGGACC 

510 520 530 540 550 
GATGGCGCTA GACCGCTCGC TAGATTGGCG GGCGAATTTG GTGCACAGAG 

560 570 580 590 600 
GCAAACCGGG CGAAATCCCT ATCCAGGCTC ACCACGGCGC AGTGATGCTC 

610 620 630 640 650 
CACGGCGATG GCCCCGAGTA CCGCGTCAGG TATCAAGTCG CCCGATGCCT 

660 670 680 690 700 
CGGCCTCGTC GCAGAGTTTO CGCAGCAGCA CCAGGTGTCT GGGGCCGGGG 

710 720 730 740 750 
CTTGTCGGAA GGTCATGGGG CTGGGCGTTG ACGGCTTCGA CGAATGCGAA 

760 770 780 790 800 
TGCATCCGCT CGTGGTGACG GAATCTCGAA GATGCGTCGA TTCGTTGTTA 

810 820 830 840 850 
GCCGGAGGAA CGACGCCCAC ACTAGGTTCG GCACTGTGAA GGGGTCGTCG 

860 870 880 890 900 
GCCGCAAGCA GTCGATCGAA CCAGGGGCGG ACGGTTCGGT GATTCGGATG 

910 920 930 940 950 
GTCACCGCGG TGTGOAGCCA GCAGCACGTT GACGTCGATG AGGAACATCG 

960 970 980 990 1000 
CCTATTTGTG CCTGTCCAGG CTCACTTCCG CGAGTTCAGT TCCAGACCCT 

1010 1020 1030 1040 1050 
CGTCGAGCAC TTCGGACAAC ACCGTATTCG AGGTTAGGTC GATACCTGGC 

1060 1070 1080 1090 1100 
CGCGGACCGG TGCCGGCGTC AAAAACGGGG ACGGTTGGCC GGGCGCCGCC 

1110 1120 1130 1140 1150 
GGTACGGGCG GCGGCGAGCT CCCGCCGAAG GGCGTCTTCG ATCACAGCGC 

1160 1170 1180 1190 1200 
CCAGCGATTA ACCACGCTCG CGGGCCCGGC GTTTGGCGGT AGCCAGTAGT 

1210 1220 1230 1240 1250 
TCATCCGAGA TTGACACGGT GGTGCGCATG ATGCTCAGGA TAGCGCATCT 

1260 1270 1280 1290 1300 
ACGGCATCAT CTGCGGTGAG CAACTGATGC CCTCAACGCC GCGTGTGGTC 

1310 1320 1330 1340 1350 
GCAGGTCTGC CTGCTATGGC AAGCCGTTGA GTCCGTTCTC GCCGAGCAGC 

1360 1370 1380 1390 1400 
AGCCCGCCGG TGCCGCCGGC ACCGGGCGTG GCCCCGGCTT TGCCGGCGTT 

1410 1420 1430 1440 1450 
GGCGCCGTTG CCGCCGTTGC CGATCAGCAC GGCGTTGCCG CCGACACCaI 

1460 1470 1480 1490 1500 
CGCTGCCGCC GGTACCGGCG CCAAACCCGC CGGCAACCCC CGTCACCGCC 

1510 1520 1530 1540 1550 
GTTGCCGAAC ACCCCGGCGT GGCCACCGTC ACCGCCGGTG CCGCCGGTAC 

1560 1570 1580 1590 1600 
CGGCGCCTAG AGCGTTGGCA CCCCTGCCGC GGGCGCCGCC GGCGCCGGCG 

1610 1620 1630 1640 1650 
GAGCCGAAGA GCAAGCCGCC GTTCCCGCCG GCGCCGCCGG CGCCGCCTTC 

1660 1670 1680 1690 1700 
CTGGATGCTG GTAAGTGCTG CCCCGCCGTG CCCGCCGGCG CCGCCGGCGC 

1710 1720 1730 1740 1750 
CGCCGAAGCC GAAGAGTAAG GCGCCGTTCC CGCCGGTTCC GCCGGCCCCG 

1760 1770 1780 1790 1800 
CCGGCAAGGG AGCTGGCGCC ACCGCTGCCG CCGGCGCCAC CGGAGGCGCC 

1810 1820 1830 1840 1850 
GAGGGAGAGT AGGCCGGCGT TGCCGCCGTG CCCGCCGCCG CCGGTGGTGA 

1860 1870 1880 1890 1900 
TCCCGGACCC TCCCGAGCCG GCGGCGCCGC CGCTGCCGCC GGCTCCGAAC 

1910 1920 1930 1940 1950 
AGTCCGCCGT TCCCGCCGTT CCCACCGGCC CCGAAGTTCG TGCCGGCCCC 

1960 1970 1980 1990 2000 
GCCGGTGCCG CCAGTTCCGA ACAGTCCGCC GTTCCCGCCG TTCCCGCCGG 

2010 2020 2030 2040 2050 
CTGCGTTGAA CCCGCCGGCC CCTCCGGCTC CGCCGTTGGC GAACAGTCCG 

2060 2070 2080 2090 2100 
CCGTTGCCGC CGGCGCCGCC GACGCCGGCC GGGACACCGC CAGCGGCGCC 

2110 2120 2130 2140 2150 
GTGGCCGCCG GTGCCGGCCG CGCCGAAGAG CAAACCGGCG TCGCCGCCGC 

2160 2170 2180 2190 2200 
GCCCGCCGGC CCCGCCGATG CCAGCGACGC CTATGGAGTT CCCACCGTTC 

2210 2220 2230 2240 2250 
CCGCCGGTGC CGCCGGATCC GATCAGCAAG GAGACCCCAC CGGCGCCGCC 

2260 2270 2280 2290 2300 
GGCCCCGCCG ATCCCTCCAG CACCGGTGCC TATCCCGCCG GTCCCGCCAT 

2310 2320 2330 2340 2350 
TGCCACCGGT ACCGAACAAG ATCCCGCCGG CCCCGCCGGC CCCGCCCGTA 

2360 2370 2380 2390 2400 
GCCGTGGCGG CGGTGTTGGT CGCACCGTGC CCGCCGTTAC CGCCGTTGCC 

2410 2420 2430 2440 2450 
GAACAACCAC CCGCCGGCCC CGCCGGCAGC CCCGGTCCCC GGGGTCCCGT 

2460 2470 2480 2490 2500 
TGGCGCCGTT GCCGAACAGC CACCCGCCGG CCCCGCCGTC AGCCCCGGTT 

2510 2520 2530 2540 2550 
CCAGGAGTCC CGTTGGCGCC GTTGCCGATC AGCGGGCGGC CGGTGAGCGT 

2560 2570 2580 2590 2600 
CTGGAAGGGC TCGTTCACCA CATTGAGCAC ATTTTGCTGC AGGGTGTGCA 

2610 2620 2630 2640 2650 
GTGGCGAGGT GCTCGCGGGA GCATTGAATC CGTCTAGACC GAGCAGGAGC 

2660 2670 2680 2690 2700 
CCGCTGACGA CGACCACTCC GGCCTTGCCC GCGCCAATCC CACCGCTACC 

2710 2720 2730 2740 2750 
GCCGTTACCG CCATTGCCGA TCAACACGGC GGTGCCACCG ATCCCGCCCT 

2760 2770 2780 2790 2800 
TGCCGCCGGT CACCGCGCTG GCGCCACCGT TACCGCCGTT GGCGCCGTTA 

2810 2820 2830 2840 2850 
CCGATCAGCC CGGGGGTGCC GCCAGCCCCA CCGATCCCGC CGGGGAAGCC 

2860 2870 2880 2890 2900 
CTGGACAACT CCGCCGTTGG CGCCGGCGCC GCCGGAGCCG AAGACCGTGC 

2910 2920 2930 2940 2950 
CGGTGTTGCC CCCGGGGCCG TCTTGCCCGC CGTCGGAGAA GCCGAATCCG 

2960 2970 2980 2990 3000 
CCGGCGCCGC CGGAGCCGCC GGAGCCGAAG AGCAGCCCAG CGTTGCCGCC 

3010 3020 3030 3040 3050 
GGCGCCGCCG GCGCCGtCTA TGCCGTCGGC CGTGAGAGTA CCGCCGTCCC 

3060 3070 3080 3090 3100 
CACCGATTCC GCCGGCGCCG CCCGCGGCGC CGAGGGCGAG CATGCCGGCA 

3110 3120 3130 3140 3150 
TTGCCGCCGG CCCCGCCGTC CCCGCCGGCG ACCAGGCTGT GTCCGCCGCT 

3160 3170 3180 3190 3200 
GCCGCCTTCC CCGCCTGCGC CGAACAGCCC GCCGGCCCCG CCGGCCCCGC 

3210 3220 3230 3240 3250 
CGACTCCGCC GAAGCTGCTG TCGGCGAACC CGCCATGCCC GCCGGTGCCG 

3260 3270 3280 3290 3300 
CCGGCGCCGA ACAGACCGCC AGCGCCACCG GCCCCACCGG CCCCGCCGGA 

3310 3320 3330 3340 3350 
GCTGCCGGCC CCACCGGATC CGCCGACCCC GCCGGTGGCG AACAGCCCGC 

3360 3370 3380 3390 3400 
CGGCCCCGCC GGCGCCGCCC GCCCCGCCGA GTGCACTGCC GTTCGTGAAT 

3410 3420 3430 3440 3450 
CCGCCGGCCC CGCCGACTCC GGCGGCGCCG AAGAGCAGGC CGGCGTTGCC 

3460 3470 3480 3490 3500 
GGCAGCCCCG CCGGCGCCGC CGGCCCCGCC CGTGAGGGCT ACTACGCCGC 

3510 3520 3530 3540 3550 
CGCCGGCGCC GCCGGCGCCG CCGGCGCCGA ACAGCATGGC GTTGCCGCCG 

3560 3570 3580 3590 3600 
GCTCCGCCGG ACCCGCCGAT CCCACTGCTG GCGACCCCGC CAGCGCCGCC 

3610 3620 3630 3640 3650 
GGCGCCGCCG TTGCCGATGA GCCCGCCGGC GCCGCCGTTG CCGCCGGCCG 

3660 3670 3680 3690 3700 
CGCCGGATCC TCCGGCGCCG CCGTTGACGA TTAACCAGCC GCCGTCCCCG 

3710 3720 3730 3740 3750 
CCATTGGCCC CGGTGCCGGG GGCGCCGTTG GCGCCGTTGC CGATCAACGG 

3760 3770 3780 3790 3800 
GCGCCCGGTA TTCGCCAGGA AGAACTCGTT GATCGGATCC AGCAGCGGCG 

3810 3820 3830 3840 3850 
ACACCGCGGC GGCCTCGGCG GCCGCATAGG CGCCGCCACC GGAGGTCAAT 

3860 3870 3880 3890 3900 
GCCTGCACGA ACTGGGCATG AAACGCCTGC GCTTGGGCGC TGAGCGCCTG 

3910 3920 3930 3940 
ATAGGCCTGG CCGTGGGCGC CGAACAGCGC GGCGATGGCT GTCGAC 

Fig. 11A. Southern hybridization with genomic DNA from different 
mycobacteria digested with PvuII (1. M. tuberculosis H37Rv, 2. M. avium, 3. 
M. kansasii, 4. M. microti, 5. M. fortuitum, 6. M. phlei, 7. M. smegmatis, 8. M. 
vaccae.) 

Fig. 11B. Fingerprint obtained using the DNA (BamHI digest) of 1. M. tuberculosis 
H37Rv, 2. M. tuberculosis H37Ra, 3. M. bovis BCG, and 4. M. tuberculosis 
H37Rv digested with SalI. 

Fig. 12. Fingerprint with DNA from different M. tuberculosis clinical isolates 
(numbered 1-12) digested with PvuII restriction enzyme. The 4 kb SalI 
fragment (Mtub-Klar-Klon) was used as probe. 

Amino acid sequence of the protein of about 74kDa. 
Molecular-weight 74999 ; Length 764 
THE AMINO ACID SEQUENCE IS GIVEN BELOW: 

10 20 30 40 50 60 

VPPVPAPRAL APLPPAPPAP AEPKSKPPFP PAPPAPPCWM LVSAAPPCPP APPAPPKPKS 

70 80 90 100 110 120 

KAPFPPVPPA PPARELAPPL PPAPPEAPRE SRPALPPCPP PPVVIPDPPE PAAPPVPPAP 

130 140 150 160 170 180 

NSPPFPPFPP APKFVPAPPV PPVPNSPPFP PFPPAALNPP APPAPPLANS PPLPPAPPTP 

190 200 210 220 230 240 

AGTPPAAPWP PVPAAPKSKP ASPPRPPAPP MPATPMEFPP LPPVPPDPIS KETPPAPPAP 

250 260 270 280 290 300 

PIPPAPVPIP PVPPLPPVPN KIPPAPPAPP VAVAAVLVAP CPPLPPLPNN HPPAPPAAPV 

310 320 330 340 350 360 

PGVPLAPLPN SHPPAPPSAP VPGVPLAPLP ISGRPVSVWK GSFTTLSTFC CRVCSGEVLA 

370 380 390 400 410 420 

GALNPSRPSR SPLTTTTPAL PAPIPPLPPL PPLPINTAVP PIPPLPPVTA LAPPLPPLAP 

430 440 450 460 470 480 

LPISPGVPPA PPIPPGKPWT TPPLAPAPPE PKTVPVLPPG PSCPPSEKPN PPAPPEPPEP 

490 500 510 520 530 540 

KSSPALPPAP PAPSMPSAVR VPPSPPIPPA PPAAPRASMP ALPPAPPSPP ATRLCPPLPP 

550 560 570 580 590 600 

SPPAPNSPPA PPAPPTPPKL LSANPPCPPV PPAPNRPPAP PAPPAPPELP APPDPPTPPV 

610 620 630 640 650 660 

ANSPPAPPAP PAPPSALPFV NPPAPPTPAA PKSRPALPAA PPAPPAPPVR ATTPPPAPPA 

670 680 690 700 710 720 

PPAPNSMALP PAPPDPPIPL LATPPAPPAP PLPMSPPAPP LPPAAPDPPA PPLTINQPPS 

730 740 750 760 770 780 

PPLAPVPGAP LAPLPINGRP VFARKNSLIG SSSGDTAAAS AAA* 

A glycine rich protein of about 77kDa. 

Molecular-weight 77056 ; Length 899 

THE AMINO ACID SEQUENCE IS GIVEN BELOW: 

10 20 30 40 50 60 

STAIAALFGA HGQAYQALSA QAQAFHAQFV QALTSGGGAY AAAEAAAVSP LLDPINEFFL 
70 80 90 100 110 120 

ANTGRPLIGN GANGAPGTGA NGGDGGWLIV NGGAGGSGAA GGNGGAGGLI GNGGAGGAGG 
130 140 150 160 170 180 

VASSGIGGSG GAGGNAMLPG AGGAGGAGGG WALTGGAGG AGGAAGNAGL LFGAAGVGGA 
190 200 210 220 230 240 

GGFTNGSALG GAGGAGGAGG LFATCGVGGS GGAGSSGGAG GAGGAGGLFG AGGTGGHGGF 
250 260 270 280 290 300 

ADSSFGGVGG AGGAGGLFGA GGEGGSGGHS LVAGGDGGAG GNAGMLALGA AGGAGGIGGD 
310 320 330 340 350 360 

GGTLTADCID GAGGAGGNAG LLFGSGGSGG AGGFGFSDGG QDGPGGNTGT VFGSGGAGAN 
370 380 390 400 410 420 

GGWQGFPGG IGGAGGTPGL IGNGANGGNG GASAVTGGNG GIGGTAVLIG NGGNGGSGGI 
430 440 450 460 470 480 

GAGKAGWW SGLLLGLDGF NAPASTSPLH TLQQNVLNW NEPFQTLTGR PLIGNGANGT 
490 500 510 520 530 540 

PGTGADGGAG GWLFGNGANG TPGTGAAGGA GGWLFGNGGN GGHGATNTAA TATGGAGGAG 
550 560 570 580 590 600 

GILFGTGGNG GTGGIGTGAG GIGGAGGAGG VSLLIGSGGT GGNGGNSIGV AGIGGAGGRG 
610 620 630 640 650 660 

GDAGLLFGAA GTGGHGAAGG VPAGVGGAGG NGGLFANGGA GGAGGFNAAG GNGGNGGLFG 
670 680 690 700 710 720 

TGGTGGAGTN FGAGGNGGNG GLFGAGGTGG AAGSGGSGIT TGGGGHGGNA GLLSLGASGG 
730 740 750 760 770 780 

AGGSGGASSL AGGAGGTGGN CALLFGFGGA CGACGMGGAA LTSIQQGGAG GAGGNGGLLF 
790 800 810 820 830 840 

GSAGAGGAGG SGANALGAGT GGTGGDGGHA GVFGNGGDGG CRRVWRRYRR QRWCRRQRRA 
850 860 870 880 890 900 

DRQRRQRRQR RQSRGHARCR RHRRAAARRE RTQRLAIAGR PATTRGVEGI SCSPQMMP* 

Amino acid sequence of the about 9 kDa proline 
rich protein 

Molecular-weight 9356 ; Length 98 
AMINO ACID SEQUENCE IS GIVEN BELOW: 

10 20 30 40 50 60 

MALPPAPPDP PIPLLATPPA PPAPPLPMSP PAPPLPPAAP DPPAPPLTIN QPPSPPLAPV 

70 80 90 100 110 120 
PGAPLAPLPI NGRPVFARKN SLIGSSSGDT AAASAAA* 

Proline rich protein of about 55kDa. 
Molecular-weight 55982 ; Length 573 
AMINO ACID SEQUENCE IS GIVEN BELOW: 

10 20 30 40 50 60 

VPAAPKSKPA SPPRPPAPPM PATPMEFPPL PPVPPDPISK ETPPAPPAPP IPPAPVPIPP 
70 80 90 100 110 120 

VPPLPPVPNK IPPAPPAPPV AVAAVLVAPC PPLPPLPNNH PPAPPAAPVP GVPLAPLPNS 

130 140 150 160 170 180 

HPPAPPSAPV PGVPLAPLPI SGRPVSVWKG SFTTLSTFCC RVCSGEVLAG ALNPSRPSRS 

190 200 210 220 230 240 

PLTTTTPALP APIPPLPPLP PLPINTAVPP IPPLPPVTAL APPLPPLAPL PISPGVPPAP 

250 260 270 280 290 300 

PIPPGKPWTT PPLAPAPPEP KTVPVLPPGP SCPPSEKPNP PAPPEPPEPK SSPALPPAPP 

310 320 330 340 350 360 

APSMPSAVRV PPSPPIPPAP PAAPRASMPA LPPAPPSPPA TRLCPPLPPS PPAPNSPPAP 

370 380 390 400 410 420 

PAPPTPPKLL SANPPCPPVP PAPNRPPAPP APPAPPELPA PPDPPTPPVA NSPPAPPAPP 

430 440 450 460 470 480 

APPSALPFVN PPAPPTPAAP KSRPALPAAP PAPPAPPVRA TTPPPAPPAP PAPNSMALPP 

490 500 510 520 530 540 

APPDPPIPLL ATPPAPPAPP LPMSPPAPPL PPAAPDPPAP PLTINQPPSP PLAPVPGAPL 

550 560 570 580 590 600 
APLPINGRPV FARKNSLIGS SSGDTAAASA AA* 



